Data Preview

Tail

         system_calendar_key_N  product_id  sales_dollars_value  sales_units_value  sales_lbs_value  Vendor         Claim_id  Claim_name  date                 platform  searchVolume  week_number  year_new
4526177  20181027               47536       8.000000             2                  3                Private Label  0         No Claim    NaT                  nan       nan           nan          nan
4526178  20181027               47539       391.000000           39                 68               Private Label  0         No Claim    NaT                  nan       nan           nan          nan
4526179  20181027               47543       105.000000           59                 48               Private Label  8         low carb    2019-09-30 00:00:00  walmart   42.000000     40.000000    2019.000000
4526180  20181027               47544       3720.000000          1246               4361             Private Label  0         No Claim    NaT                  nan       nan           nan          nan
4526181  20181027               47545       1729.000000          2016               378              Private Label  227       salmon      2019-07-11 00:00:00  walmart   42.000000     28.000000    2019.000000
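
In the rows that carry a date, week_number and year_new match the ISO calendar week and year of that date (2019-09-30 falls in week 40, 2019-07-11 in week 28). The report does not say how the two columns were derived, so the sketch below only assumes an ISO-calendar derivation; the helper name week_and_year is hypothetical.

```python
from datetime import date


def week_and_year(d):
    """Return (ISO week number, ISO year) for a date (assumed derivation)."""
    iso = d.isocalendar()  # (ISO year, ISO week, ISO weekday)
    return iso[1], iso[0]


print(week_and_year(date(2019, 9, 30)))  # (40, 2019)
print(week_and_year(date(2019, 7, 11)))  # (28, 2019)
```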

Pre Processing

Pre Processing - Imputations

Pre Processing - Imputations - Missing

              no_of_missing  imputed with
date          2796557        Not Imputed
searchVolume  2796557        median
week_number   2796557        median
platform      2796557        mode
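
A minimal sketch of the two strategies named in the table, in plain Python so that it is self-contained. The function name impute and the toy values are illustrative assumptions, not the code or data behind this report.

```python
from statistics import median, mode


def impute(values, strategy):
    """Replace None entries with the median or mode of the observed values."""
    observed = [v for v in values if v is not None]
    if strategy == "median":
        fill = median(observed)
    elif strategy == "mode":
        fill = mode(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


# Toy values only, not rows from the report.
search_volume = [42.0, None, 82.0, None, 41.0]
platform = ["walmart", None, "amazon", "walmart", None]

print(impute(search_volume, "median"))  # [42.0, 42.0, 82.0, 42.0, 41.0]
print(impute(platform, "mode"))
# ['walmart', 'walmart', 'amazon', 'walmart', 'walmart']
```

With about 62% of the rows filled this way, the fill value dominates each imputed column. That is consistent with the summary statistics, where the 25th, 50th, and 75th percentiles of searchVolume are all 42 and walmart accounts for about 87% of platform.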

Pre Processing - Imputations - Infinities

No Infinity values

Pre Processing - Encoding

            original_type  new_columns                method
Vendor      category       target_encoded_Vendor      target encoded
Claim_name  category       target_encoded_Claim_name  target encoded
platform    category       target_encoded_platform    target encoded
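
The report does not state which target or smoothing the encoder used. The three encoded columns have the same mean as sales_dollars_value (about 21594.54) in the summary statistics, which is consistent with plain per-category mean encoding on that column. The sketch below assumes exactly that; the function name target_encode and the toy values are illustrative.

```python
from collections import defaultdict


def target_encode(categories, target):
    """Map each category to the mean of the target over its rows."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, target):
        totals[cat] += y
        counts[cat] += 1
    mapping = {cat: totals[cat] / counts[cat] for cat in totals}
    return mapping, [mapping[cat] for cat in categories]


# Toy values only, not rows from the report.
vendor = ["A", "B", "A", "Others", "B"]
sales = [100.0, 40.0, 300.0, 10.0, 60.0]

mapping, encoded = target_encode(vendor, sales)
print(mapping)  # {'A': 200.0, 'B': 50.0, 'Others': 10.0}
print(encoded)  # [200.0, 50.0, 200.0, 10.0, 50.0]
```

Under unsmoothed mean encoding, the row-wise average of the encoded column always equals the overall target mean (102.0 in the toy data). Because the encoding is computed from the target itself, it can leak target information if it is fitted on the same rows that are later used for evaluation.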

Pre Processing - Encoded Mappings

Pre Processing - Encoded Mappings - Vendor

  encoded values
Vendor  
A 61750.959917
B 39558.815754
D 48861.144854
E 86181.821483
F 30266.303846
G 86227.455719
H 22957.239303
Others 8444.867410
Private Label 11005.184886

Pre Processing - Encoded Mappings - Claim Name

  encoded values
Claim_name  
No Claim 28727.111414
american gumbo 5818.260216
american southwest style 15267.730213
apple cinnamon 15477.187623
beans 5928.162127
beef hamburger 812.664179
blueberry 14637.362056
bone health 3436.753788
brown ale 5033.345939
buckwheat 8388.825453
cherry 923.080275
chicken 5459.513713
cocoa 1071.791209
convenience - easy-to-prepare 2457.032690
cookie 23814.158296
crab 4815.734170
energy/alertness 1463.182359
ethical - packaging 16630.730241
ethnic & exotic 23146.678131
french bisque 10226.889405
gingerbread 6364.437770
gmo free 14665.686773
halal 4337.050725
herbs 15143.724621
high/source of protein 4259.732247
low calorie 44263.891973
low carb 9757.032854
low sodium 8064.472025
low sugar 5069.789390
mackerel 517.027397
no additives/preservatives 16500.746148
nuts 3056.388896
peanut 8599.744731
pizza 34696.640181
pollock 25076.922129
poultry 6058.704206
prebiotic 750.491803
red raspberry 764.764706
salmon 48528.062483
scallop 7099.241866
soy foods 57203.664029
stroganoff 37362.982937
tilapia 765.604413
tuna 2635.494596
vegetarian 3938.537104

Pre Processing - Encoded Mappings - Platform

  encoded values
platform  
amazon 16500.746148
chewy 4259.732247
google 10746.825390
walmart 22655.396990

Health Analysis

Health Plot

Missing Plot

Missing Value Summary

   Variable Name  No of Missing (out of 4526182)  Percent Missing
0  date           2796557                         61.786225
1  platform       2796557                         61.786225
2  searchVolume   2796557                         61.786225
3  week_number    2796557                         61.786225
4  year_new       2796557                         61.786225
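
A minimal sketch of how such a summary can be computed: count the missing entries per column and divide by the row count. The function name missing_summary and the toy columns are illustrative assumptions; the last line only re-derives the percentage from the two counts printed in the table.

```python
import math


def missing_summary(columns):
    """Return {name: (n_missing, pct_missing)} for a dict of column lists."""
    summary = {}
    for name, values in columns.items():
        n_missing = sum(
            1
            for v in values
            if v is None or (isinstance(v, float) and math.isnan(v))
        )
        summary[name] = (n_missing, 100.0 * n_missing / len(values))
    return summary


# Toy columns only, not rows from the report.
toy = {
    "platform": ["walmart", None, None, "amazon"],
    "searchVolume": [42.0, float("nan"), float("nan"), float("nan")],
}
print(missing_summary(toy))
# {'platform': (2, 50.0), 'searchVolume': (3, 75.0)}

# Check against the table: 2796557 missing out of 4526182 rows.
print(round(100.0 * 2796557 / 4526182, 6))  # 61.786225
```

All five columns share the same missing count, which suggests they are absent together on the same rows, as in the NaT/nan rows of the data preview.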

Duplicate Columns

No duplicate variables

Outliers In Features

Data Shape: (4526182, 13)
feature              < (mean-3*std)  > (mean+3*std)  < (1stQ - 1.5 * IQR)  > (3rdQ + 1.5 * IQR)  -inf  +inf
sales_dollars_value  0               70655           0                     631250                0     0
sales_units_value    0               47393           0                     668926                0     0
sales_lbs_value      0               35562           0                     715431                0     0
Claim_id             0               66482           0                     810875                0     0
searchVolume         0               32416           0                     204899                0     0
week_number          12896           0               0                     0                     0     0
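
The column headers name two rules: a value is flagged when it lies more than three standard deviations from the mean, or outside the fences Q1 - 1.5 * IQR and Q3 + 1.5 * IQR. A minimal sketch of both counts follows. The function name outlier_counts, the sample standard deviation, and the inclusive quantile method are assumptions, since the report does not say which variants it used.

```python
from statistics import mean, quantiles, stdev


def outlier_counts(values):
    """Count values beyond mean +/- 3*std and beyond the 1.5*IQR fences."""
    mu, sigma = mean(values), stdev(values)  # sample std (assumption)
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "below_3std": sum(v < mu - 3 * sigma for v in values),
        "above_3std": sum(v > mu + 3 * sigma for v in values),
        "below_iqr": sum(v < q1 - 1.5 * iqr for v in values),
        "above_iqr": sum(v > q3 + 1.5 * iqr for v in values),
    }


# Toy values only: 1..19 plus one extreme value.
values = list(range(1, 20)) + [500]
print(outlier_counts(values))
# {'below_3std': 0, 'above_3std': 1, 'below_iqr': 0, 'above_iqr': 1}
```

In the table the IQR rule flags far more rows than the three-sigma rule for the sales columns (631250 versus 70655 for sales_dollars_value). That is typical of right-skewed data, where a long upper tail inflates the standard deviation but leaves the quartiles largely unaffected.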

Feature Analysis

Summary Stats

Summary Stats - Numeric Variables

  Variable Name Datatype No of Unique Samples Mean Standard Deviation Min 25th percentile Median 75th percentile Max
0 Claim_id float64 44 [0.0, 158.0, 227.0, 432.0, 185.0] 63.160495 124.075774 0.000000 0.000000 8.000000 40.000000 435.715826
1 product_id float64 42616 [1.0, 3.0, 4.0, 6.0, 7.0] 28858.568988 15312.536560 1.000000 15069.000000 29981.000000 41513.000000 57317.000000
2 sales_dollars_value float64 254341 [13927.0, 10289.0, 357.0, 23113.0, 23177.0] 21594.541104 78180.565626 0.000000 523.000000 2655.000000 11765.000000 4395964.000000
3 sales_lbs_value float64 171749 [18680.0, 28646.0, 440.0, 81088.0, 58164.0] 12514.292899 47823.053345 0.000000 86.000000 611.000000 3770.000000 399173.765337
4 sales_units_value float64 71153 [934.0, 1592.0, 22.0, 2027.0, 3231.0] 3815.065102 11761.232981 1.000000 80.000000 403.000000 1807.000000 85720.276857
5 searchVolume float64 19 [42.0, 41.0, 416.0, 82.0, 2737.6345175977913] 75.472578 234.213209 2.000000 42.000000 42.000000 42.000000 2737.634518
6 system_calendar_key_N float64 196 [20160109.0, 20160116.0, 20160123.0, 20160130.0, 20160206.0] 20175054.752485 10735.371398 20160109.000000 20161231.000000 20171209.000000 20181103.000000 20191005.000000
7 target_encoded_Claim_name float64 45 [28727.111413533636, 5459.513712624824, 48528.06248297776, 15477.187623235048, 23814.15829608204] 21594.542805 10039.509580 517.027397 9757.032854 28727.111414 28727.111414 57203.664029
8 target_encoded_Vendor float64 9 [8444.867410442677, 61750.95991743127, 11005.1848864566, 39558.81575434599, 86181.82148253488] 21594.541105 19916.449630 8444.867410 8444.867410 11005.184886 39558.815754 86227.455719
9 target_encoded_platform float64 4 [22655.396989619167, 16500.746148209884, 10746.82539023493, 4259.73224703279] 21594.541103 2920.882736 4259.732247 22655.396990 22655.396990 22655.396990 22655.396990
10 week_number float64 14 [40.0, 22.0, 28.0, 31.0, 36.0] 37.755768 5.689954 9.804649 40.000000 40.000000 40.000000 40.000000

Summary Stats - Non Numeric Variables

   Variable Name  Datatype        No of Unique  Samples                                                        Mode                 Mode Freq  first                last                 Mode Freq %
0  Claim_name     category        45            ['No Claim', 'chicken', 'salmon', 'apple cinnamon', 'cookie']  No Claim             2045703    NaT                  NaT                  45.197100
1  Vendor         category        9             ['Others', 'A', 'Private Label', 'B', 'E']                     Others               2195912    NaT                  NaT                  48.515769
2  platform       category        4             ['walmart', 'amazon', 'google', 'chewy']                       walmart              3933788    NaT                  NaT                  86.911839
3  date           datetime64[ns]  22            [NaT, 2019-06-01, 2019-07-11, 2019-08-04, 2019-09-02]          2019-09-30 00:00:00  907287     2019-01-04 00:00:00  2019-09-30 00:00:00  52.455706

Distributions

Distributions - Numeric Variables

Distributions - Numeric Variables - Claim Id

Distributions - Numeric Variables - Product Id

Distributions - Numeric Variables - Sales Dollars Value

Distributions - Numeric Variables - Sales Lbs Value

Distributions - Numeric Variables - Sales Units Value

Distributions - Numeric Variables - Searchvolume

Distributions - Numeric Variables - System Calendar Key N

Distributions - Numeric Variables - Target Encoded Claim Name

Distributions - Numeric Variables - Target Encoded Vendor

Distributions - Numeric Variables - Target Encoded Platform

Distributions - Numeric Variables - Week Number

Distributions - Non Numeric Variables

Distributions - Non Numeric Variables - Claim Name

Distributions - Non Numeric Variables - Vendor

Distributions - Non Numeric Variables - Platform

Feature Normality

Feature Interactions

Correlation Table

    Variable 1                 Variable 2                 Corr Coef  Abs Corr Coef
0   sales_dollars_value        sales_lbs_value            0.778679   0.778679
1   sales_lbs_value            sales_dollars_value        0.778679   0.778679
2   sales_dollars_value        sales_units_value          0.554073   0.554073
3   sales_units_value          sales_dollars_value        0.554073   0.554073
4   sales_dollars_value        target_encoded_Vendor      0.254749   0.254749
5   target_encoded_Vendor      sales_dollars_value        0.254749   0.254749
6   product_id                 sales_dollars_value        -0.147133  0.147133
7   sales_dollars_value        product_id                 -0.147133  0.147133
8   target_encoded_Claim_name  sales_dollars_value        0.128414   0.128414
9   sales_dollars_value        target_encoded_Claim_name  0.128414   0.128414
10  target_encoded_platform    sales_dollars_value        0.037361   0.037361
11  sales_dollars_value        target_encoded_platform    0.037361   0.037361
12  week_number                sales_dollars_value        0.021904   0.021904
13  sales_dollars_value        week_number                0.021904   0.021904
14  sales_dollars_value        searchVolume               -0.019538  0.019538
15  searchVolume               sales_dollars_value        -0.019538  0.019538
16  Claim_id                   sales_dollars_value        -0.010143  0.010143
17  sales_dollars_value        Claim_id                   -0.010143  0.010143
18  system_calendar_key_N      sales_dollars_value        -0.003643  0.003643
19  sales_dollars_value        system_calendar_key_N      -0.003643  0.003643
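
A minimal sketch of how a table like this can be produced: compute the correlation of every feature with sales_dollars_value and sort by absolute value. It assumes the Pearson coefficient, which the report does not confirm. The function names pearson and rank_by_abs_corr and the toy columns are illustrative. The report lists each pair twice, once per ordering; the sketch lists each pair once.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rank_by_abs_corr(columns, target):
    """Return [(feature, corr, abs corr)] against target, strongest first."""
    rows = []
    for name, values in columns.items():
        if name == target:
            continue
        r = pearson(values, columns[target])
        rows.append((name, r, abs(r)))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Toy columns only, not rows from the report.
toy = {
    "sales_dollars_value": [1.0, 2.0, 3.0, 4.0, 5.0],
    "sales_units_value": [1.0, 3.0, 2.0, 5.0, 4.0],
    "sales_lbs_value": [2.0, 4.0, 6.0, 8.0, 10.0],
    "searchVolume": [3.0, 1.0, 4.0, 1.0, 5.0],
}
for name, r, abs_r in rank_by_abs_corr(toy, "sales_dollars_value"):
    print(f"{name:20s} {r: .6f} {abs_r:.6f}")
```

Sorting by the absolute value keeps strong negative relationships near the top, which is why the table carries both a signed and an absolute column.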

Correlation Heatmap

Covariance Heatmap

Bivariate Plots (top 50 Correlations)

Bivariate Plots (top 50 Correlations) - Sales Lbs Value Vs Sales Dollars Value

Bivariate Plots (top 50 Correlations) - Week Number Vs Target Encoded Platform

Bivariate Plots (top 50 Correlations) - Sales Units Value Vs Sales Dollars Value

Bivariate Plots (top 50 Correlations) - Sales Units Value Vs Sales Lbs Value

Bivariate Plots (top 50 Correlations) - Target Encoded Vendor Vs Sales Units Value

Bivariate Plots (top 50 Correlations) - Target Encoded Platform Vs Target Encoded Claim Name

Bivariate Plots (top 50 Correlations) - Target Encoded Vendor Vs Sales Lbs Value

Bivariate Plots (top 50 Correlations) - Target Encoded Vendor Vs Sales Dollars Value

Bivariate Plots (top 50 Correlations) - Target Encoded Vendor Vs Target Encoded Claim Name

Bivariate Plots (top 50 Correlations) - Week Number Vs Target Encoded Claim Name

Bivariate Plots (top 50 Correlations) - Target Encoded Claim Name Vs Sales Units Value

Bivariate Plots (top 50 Correlations) - Target Encoded Claim Name Vs Sales Lbs Value

Bivariate Plots (top 50 Correlations) - Target Encoded Claim Name Vs Sales Dollars Value

Bivariate Plots (top 50 Correlations) - Searchvolume Vs Claim Id

Bivariate Plots (top 50 Correlations) - Target Encoded Platform Vs Sales Units Value

Bivariate Plots (top 50 Correlations) - Week Number Vs Sales Units Value

Bivariate Plots (top 50 Correlations) - Target Encoded Platform Vs Sales Lbs Value

Bivariate Plots (top 50 Correlations) - Target Encoded Platform Vs Target Encoded Vendor

Bivariate Plots (top 50 Correlations) - Target Encoded Platform Vs Product Id

Bivariate Plots (top 50 Correlations) - Week Number Vs Product Id

Bivariate Plots (top 50 Correlations) - Week Number Vs Sales Lbs Value

Bivariate Plots (top 50 Correlations) - Target Encoded Platform Vs Sales Dollars Value

Bivariate Plots (top 50 Correlations) - Target Encoded Vendor Vs Claim Id

Bivariate Plots (top 50 Correlations) - Week Number Vs Sales Dollars Value

Bivariate Plots (top 50 Correlations) - System Calendar Key N Vs Product Id

Bivariate Plots (top 50 Correlations) - Week Number Vs Target Encoded Vendor

Bivariate Plots (top 50 Correlations) - System Calendar Key N Vs Searchvolume

Bivariate Plots (top 50 Correlations) - System Calendar Key N Vs Claim Id

Bivariate Plots (top 50 Correlations) - System Calendar Key N Vs Sales Dollars Value

Bivariate Plots (top 50 Correlations) - Week Number Vs System Calendar Key N

Bivariate Plots (top 50 Correlations) - Sales Dollars Value Vs Claim Id

Bivariate Plots (top 50 Correlations) - Target Encoded Platform Vs System Calendar Key N

Bivariate Plots (top 50 Correlations) - System Calendar Key N Vs Sales Units Value

Bivariate Plots (top 50 Correlations) - System Calendar Key N Vs Sales Lbs Value

Bivariate Plots (top 50 Correlations) - Target Encoded Vendor Vs System Calendar Key N

Bivariate Plots (top 50 Correlations) - Searchvolume Vs Sales Dollars Value

Bivariate Plots (top 50 Correlations) - Searchvolume Vs Sales Lbs Value

Bivariate Plots (top 50 Correlations) - Sales Lbs Value Vs Claim Id

Bivariate Plots (top 50 Correlations) - Week Number Vs Searchvolume

Bivariate Plots (top 50 Correlations) - Searchvolume Vs Product Id

Bivariate Plots (top 50 Correlations) - Searchvolume Vs Sales Units Value

Bivariate Plots (top 50 Correlations) - Target Encoded Vendor Vs Searchvolume

Bivariate Plots (top 50 Correlations) - Target Encoded Claim Name Vs Product Id

Bivariate Plots (top 50 Correlations) - Target Encoded Claim Name Vs System Calendar Key N

Bivariate Plots (top 50 Correlations) - Target Encoded Platform Vs Claim Id

Bivariate Plots (top 50 Correlations) - Target Encoded Claim Name Vs Claim Id

Bivariate Plots (top 50 Correlations) - Sales Units Value Vs Claim Id

Bivariate Plots (top 50 Correlations) - Week Number Vs Claim Id

Bivariate Plots (top 50 Correlations) - Sales Lbs Value Vs Product Id

Bivariate Plots (top 50 Correlations) - Sales Dollars Value Vs Product Id

Key Drivers

Sales Dollars Value

Sales Dollars Value - Feature Scores - Feature Correlation

Sales Dollars Value - Feature Importances - From Model

Sales Dollars Value - PCA Analysis

Sales Dollars Value - PCA Analysis - PCA Projection

Sales Dollars Value - PCA Analysis - Correlation With Dimension 2 (y)

Sales Dollars Value - PCA Analysis - Correlation With Dimension 1 (x)

Sales Dollars Value - Bivariate Plots

Sales Dollars Value - Bivariate Plots - System Calendar Key N

Sales Dollars Value - Bivariate Plots - Searchvolume

Sales Dollars Value - Bivariate Plots - Date

Sales Dollars Value - Bivariate Plots - Target Encoded Claim Name

Sales Dollars Value - Bivariate Plots - Product Id

Sales Dollars Value - Bivariate Plots - Sales Lbs Value

Sales Dollars Value - Bivariate Plots - Vendor

Sales Dollars Value - Bivariate Plots - Sales Units Value

Sales Dollars Value - Bivariate Plots - Claim Name

Sales Dollars Value - Bivariate Plots - Platform

Sales Dollars Value - Bivariate Plots - Week Number

Sales Dollars Value - Bivariate Plots - Target Encoded Vendor

Sales Dollars Value - Bivariate Plots - Claim Id

Sales Dollars Value - Bivariate Plots - Target Encoded Platform